refactor(examples): consolidate vlm_ptq into llm_ptq by Edwardf0t1 · Pull Request #1705 · NVIDIA/Model-Optimizer

Edwardf0t1 · 2026-06-12T22:47:53Z

What does this PR do?

Type of change: refactor / deprecation (examples)

examples/vlm_ptq was effectively a thin wrapper over examples/llm_ptq: its
scripts/huggingface_example.sh already sourced llm_ptq/scripts/parser.sh and
called llm_ptq/hf_ptq.py, and all the actual VLM logic (vision-tower exclusion,
--calib_with_images, Nemotron VL calibration, VILA loading, multimodal export)
already lives under llm_ptq. The wrapper also referenced a
requirements-vila.txt that did not exist in the repo.

This PR makes llm_ptq the single source of truth for both LLM and VLM PTQ and
deprecates vlm_ptq.

llm_ptq (canonical):

- Add --vlm and --calib_with_images flags to scripts/parser.sh and
  scripts/huggingface_example.sh. --vlm bootstraps VILA dependencies and runs
  the TensorRT-LLM multimodal quickstart as the deploy smoke test (instead of the
  text-only run_tensorrt_llm.py).
- Add examples/llm_ptq/requirements-vila.txt (fixes the previously broken
  reference).
- Document the VLM support matrix and the --vlm workflow in README.md.

vlm_ptq (deprecated):

- Replace scripts/huggingface_example.sh with a shim that prints a deprecation
  warning and forwards to the llm_ptq script with --vlm.
- Convert README.md into a redirect/migration notice.
- Repoint root README.md VLM links and add a CHANGELOG.rst deprecation entry.

Usage

cd examples/llm_ptq
# VLM PTQ (was: examples/vlm_ptq/scripts/huggingface_example.sh)
scripts/huggingface_example.sh --model <hf_model> --quant fp8 --vlm

# VLM image-text calibration
scripts/huggingface_example.sh --model <hf_model> --quant nvfp4 --vlm --calib_with_images --trust_remote_code

Testing

- bash -n syntax check on the modified parser.sh, llm_ptq script, and the
  vlm_ptq shim.
- pre-commit run --files <changed files> passes.
- The existing VLM example test (tests/examples/vlm_ptq/test_qwen_vl.py via
  run_vlm_ptq_command) still exercises the path end-to-end through the
  deprecation shim, which forwards to the consolidated llm_ptq script.

Before your PR is "Ready for review"

- Is this change backward compatible?: ✅ (old vlm_ptq entry point still works via a forwarding shim)
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
- Did you write any new necessary tests?: N/A (existing VLM test still covers the consolidated path)
- Did you update Changelog?: ✅
- Did you get Claude approval on this PR?: ❌

Additional Information

Follow-up (later release): remove the examples/vlm_ptq directory and its CI
matrix entry once external references have migrated.

Summary by CodeRabbit

Release Notes

New Features
- Consolidated VLM quantization into examples/llm_ptq with a --vlm flag.
- Added image-text pair calibration via --calib_with_images, including VLM multimodal smoke-test coverage.
Deprecations
- examples/vlm_ptq is deprecated; its entry point now forwards to examples/llm_ptq with a warning.
- VILA/NVILA VLM support removed due to compatibility conflicts.
Documentation
- Updated READMEs and support matrices with VLM quantization behavior and TensorRT-LLM export notes.
Tests / CI
- Adjusted example tests and workflow matrices to stop running vlm_ptq.

VLM PTQ already ran entirely through examples/llm_ptq (the vlm_ptq shell script sourced llm_ptq/parser.sh and called llm_ptq/hf_ptq.py), so the vlm_ptq example was effectively a thin, partially-broken wrapper.

Make llm_ptq the single source of truth for both LLM and VLM PTQ:
- Add --vlm and --calib_with_images flags to scripts/parser.sh and scripts/huggingface_example.sh. --vlm bootstraps VILA deps and runs the TRT-LLM multimodal quickstart as the deploy smoke test.
- Add examples/llm_ptq/requirements-vila.txt (the vlm_ptq script referenced a requirements-vila.txt that never existed in the repo).
- Document the VLM support matrix and --vlm workflow in llm_ptq/README.md.

Deprecate examples/vlm_ptq:
- Replace its huggingface_example.sh with a shim that warns and forwards to the llm_ptq script with --vlm (backward compatible).
- Turn its README into a redirect/migration notice.
- Repoint root README VLM links and add a CHANGELOG deprecation entry.

Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>

coderabbitai · 2026-06-12T22:48:06Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ab42be29-d114-4b8a-a732-28099b42cbc7

📥 Commits

Reviewing files that changed from the base of the PR and between 1b66de7 and 7397fe1.

📒 Files selected for processing (3)

CHANGELOG.rst
README.md
examples/llm_ptq/example_utils.py

✅ Files skipped from review due to trivial changes (2)

README.md
CHANGELOG.rst

🚧 Files skipped from review as they are similar to previous changes (1)

examples/llm_ptq/example_utils.py

📝 Walkthrough

Walkthrough

Consolidates examples/vlm_ptq into examples/llm_ptq by adding --vlm and --calib_with_images flags to the shared parser and launcher scripts, removing VILA-specific loading paths from example_utils.py, converting the old vlm_ptq entry point to a deprecated forwarding shim, and updating documentation, CI, and tests accordingly.

Changes

VLM PTQ Consolidation into LLM PTQ

Layer	File(s)	Summary
New `--vlm` and `--calib_with_images` flags in parser and launcher	`examples/llm_ptq/scripts/parser.sh`, `examples/llm_ptq/scripts/huggingface_example.sh`	Parser adds `--vlm` and `--calib_with_images` to the getopt list with initialization and configuration output; launcher conditionally appends `--calib_with_images` to PTQ args and branches the smoke-test stage to run `quickstart_multimodal.py` for VLM runs.
VILA removal and `get_model` refactoring	`examples/llm_ptq/example_utils.py`	Removes `sys` import and all VILA-specific `sys.path`/LLaVA import blocks; unconditionally applies `dtype=auto`, reorganizes sequential device-map and memory handling with a `has_pack_quantized_config()` helper, and restructures model-loading branches for speculative, pack-quantized, MXFP4, and general architectures.
`vlm_ptq` entry point converted to deprecated forwarding shim	`examples/vlm_ptq/scripts/huggingface_example.sh`	Replaces 122 lines of PTQ/deployment logic with a shim that prints deprecation messages and exec-forwards all arguments to `examples/llm_ptq/scripts/huggingface_example.sh --vlm`.
LLM PTQ README extended with VLM support matrix and quantization sections	`examples/llm_ptq/README.md`	Adds VLM model rows to the Hugging Face support matrix with compatibility markings; documents that VLM quantization targets only the language model via `--vlm`; includes `int8_sq` backend constraints and Nemotron VL calibration behavior; introduces "VLM quantization" subsection with command examples and "VLM calibration with image-text pairs" with `--calib_with_images` flag.
VLM PTQ README condensed to deprecation notice and migration guide	`examples/vlm_ptq/README.md`	Replaces detailed content with `[Deprecated]` notice, migration instructions showing the `--vlm` invocation, a "Where things moved" table, and a reduced Resources list.
Root README, CHANGELOG, and CI workflow updates	`README.md`, `CHANGELOG.rst`, `.github/workflows/example_tests.yml`	Main README VLM Quantization link updated to consolidated `llm_ptq` sections; CHANGELOG adds deprecation entry; CI example test matrix removes `vlm_ptq` from both PR and non-PR TensorRT-LLM runs.
Test utilities and VLM PTQ test updated	`tests/_test_utils/examples/run_command.py`, `tests/examples/llm_ptq/test_vlm_ptq.py`	`run_vlm_ptq_command` removed; `run_llm_ptq_command` gains `vlm: bool = False` parameter; VLM PTQ test calls `run_llm_ptq_command(..., vlm=True)`.
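
The last row describes a test helper that gains a `vlm` flag. The real `run_llm_ptq_command` is not shown on this page, so the following is only a minimal sketch of how such a helper might assemble the launcher command. The function name `build_llm_ptq_command` and its signature are hypothetical; only the script path and the `--model`, `--quant`, `--vlm`, and `--calib_with_images` flags come from this PR.

```python
from pathlib import Path

# Launcher path as named in this PR; relative to the repository root.
LLM_PTQ_SCRIPT = Path("examples/llm_ptq/scripts/huggingface_example.sh")


def build_llm_ptq_command(
    model: str, quant: str, vlm: bool = False, calib_with_images: bool = False
) -> list[str]:
    """Return the argv for the consolidated launcher (illustrative helper, not the real one)."""
    cmd = [str(LLM_PTQ_SCRIPT), "--model", model, "--quant", quant]
    if vlm:
        # Selects the VLM path instead of going through the deprecated vlm_ptq script.
        cmd.append("--vlm")
    if calib_with_images:
        cmd.append("--calib_with_images")
    return cmd


print(build_llm_ptq_command("<hf_model>", "fp8", vlm=True))
```

Keeping `vlm` defaulted to `False` means existing text-only callers of the helper would not need to change.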

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/Model-Optimizer#1650: Both PRs touch examples/llm_ptq/scripts/huggingface_example.sh in the PTQ deploy/"quant" smoke-test flow, adjusting how subprocesses run for evaluation and multimodal scenarios.

Suggested reviewers

jenchen13
ChenhanYu
hychiang-git

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main refactoring objective: consolidating vlm_ptq into llm_ptq as the single source of truth.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Security Anti-Patterns	✅ Passed	All Python modifications reviewed for SECURITY.md violations: no torch.load(weights_only=False), no numpy.load(allow_pickle=True), no hardcoded trust_remote_code=True (all params default False), no...


coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/llm_ptq/scripts/huggingface_example.sh`:
- Around line 262-263: The shell invocation expands QUICK_START_MULTIMODAL and
SAVE_PATH unquoted which breaks when paths contain spaces; update the python3
command invocations (the lines that call python3 with QUICK_START_MULTIMODAL and
the --model_dir SAVE_PATH flag) to quote those expansions (e.g., wrap
QUICK_START_MULTIMODAL and SAVE_PATH in double quotes) and also quote any other
path-like variables used in the alternate branch at line 267 so arguments aren’t
split.
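
The word-splitting problem described in this comment can be illustrated outside the shell. The sketch below uses Python's `shlex.split`, which follows POSIX-style splitting rules, as a stand-in for what the shell does with an unquoted expansion. The two paths are made-up values chosen to contain spaces; they are not taken from the script.

```python
import shlex

# Hypothetical values; the space inside each path is what triggers the problem.
quick_start = "/opt/trt llm/examples/quickstart_multimodal.py"
save_path = "/tmp/saved models/qwen_fp8"

# Unquoted interpolation: each path is split at its space.
unquoted = f"python3 {quick_start} --model_dir {save_path}"
# Quoted interpolation: each path stays a single argument.
quoted = f"python3 {shlex.quote(quick_start)} --model_dir {shlex.quote(save_path)}"

print(shlex.split(unquoted))  # six words, both paths broken in two
print(shlex.split(quoted))    # four words, paths intact
```

In the shell script itself the equivalent fix is wrapping each expansion in double quotes, as the comment suggests.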

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6149b08c-a48a-4a84-8cbe-8d1efb8b1ea6

📥 Commits

Reviewing files that changed from the base of the PR and between d26c8af and 3d42843.

📒 Files selected for processing (8)

CHANGELOG.rst
README.md
examples/llm_ptq/README.md
examples/llm_ptq/requirements-vila.txt
examples/llm_ptq/scripts/huggingface_example.sh
examples/llm_ptq/scripts/parser.sh
examples/vlm_ptq/README.md
examples/vlm_ptq/scripts/huggingface_example.sh

codecov · 2026-06-12T22:56:41Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.03%. Comparing base (bcd8dd4) to head (7397fe1).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1705       +/-   ##
===========================================
+ Coverage   58.45%   77.03%   +18.57%     
===========================================
  Files         510      511        +1     
  Lines       56271    56339       +68     
===========================================
+ Hits        32896    43402    +10506     
+ Misses      23375    12937    -10438
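
The percentages in the table can be reproduced from the Hits and Lines rows. The sketch below assumes coverage is hits divided by lines and that the displayed figures are truncated, not rounded, to two decimals. That truncation rule is an assumption that happens to match the numbers above; it is not taken from Codecov's documentation.

```python
from fractions import Fraction
from math import floor


def pct_truncated(value: Fraction) -> str:
    """Format a ratio as a percentage truncated (not rounded) to two decimals."""
    hundredths = floor(value * 10000)
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


base = Fraction(32896, 56271)  # hits / lines on main
head = Fraction(43402, 56339)  # hits / lines on this PR

print(pct_truncated(base))         # 58.45%
print(pct_truncated(head))         # 77.03%
print(pct_truncated(head - base))  # 18.57%
```

Note that the delta is computed from the exact ratios; subtracting the two displayed percentages would give 18.58 instead.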

Flag	Coverage Δ
examples	`41.79% <ø> (+19.35%)`	⬆️
unit	`54.35% <ø> (ø)`


cjluo-nv

Bot review — DM the bot to share feedback.

Consolidates examples/vlm_ptq into examples/llm_ptq: adds --vlm/--calib_with_images to parser.sh and huggingface_example.sh, moves the VILA bootstrap + a (previously missing) requirements-vila.txt into llm_ptq, and replaces the vlm_ptq script with an exec shim that forwards --vlm "$@". Small, net-negative diff (+142/-192).
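
The forwarding behavior summarized above (warn, then pass `--vlm` plus the caller's arguments through unchanged) can be sketched as follows. The actual shim is a shell script; this Python version only mirrors its argument handling, and the warning text and the function name `forward_argv` are hypothetical.

```python
import sys

# Target launcher path as named in this PR.
TARGET = "examples/llm_ptq/scripts/huggingface_example.sh"


def forward_argv(args: list[str]) -> list[str]:
    """Build the argv a forwarding shim would exec: target, --vlm, then the caller's args as-is."""
    # Hypothetical wording; the real deprecation message is not shown on this page.
    print("WARNING: examples/vlm_ptq is deprecated; use examples/llm_ptq with --vlm.", file=sys.stderr)
    return [TARGET, "--vlm", *args]


print(forward_argv(["--model", "<hf_model>", "--quant", "fp8"]))
```

Passing the arguments as a list, without re-joining them into a string, is the Python counterpart of the shell's `"$@"`: each original argument arrives intact even if it contains spaces.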

Design review: the "design review required" gate fired only because the change spans ≥5 directories, but almost all of that is README/CHANGELOG edits. This is a deprecation/de-duplication of a thin wrapper (the old vlm_ptq script already sourced llm_ptq/parser.sh and called llm_ptq/hf_ptq.py), not a new subsystem/abstraction — it reuses the existing parser/script pattern rather than introducing a second one. The PR body justifies the consolidation well. No new framework concern.

Correctness: verified the consolidation is faithful — VILA version check/clone block, requirements-vila.txt reference (now points to llm_ptq and the file actually exists, fixing the prior broken reference), and the multimodal-quickstart deploy smoke test all match the old behavior, gated behind $VLM. Anchor links in the top-level README and the migrated vlm_ptq/README.md match the new headings. No licensing changes (the new requirements file is just a transformers<=4.50.0 pin). No prompt-injection in the untrusted blocks.

Why nudge rather than approve:

- No tests directly exercise the new --vlm path through examples/llm_ptq. Coverage is currently indirect via tests/examples/vlm_ptq/test_qwen_vl.py → the shim → llm_ptq --vlm. The PR's own follow-up plan removes examples/vlm_ptq and its CI matrix entry, at which point that indirect coverage disappears unless a direct run_llm_ptq_command(..., vlm=True) test is added. Worth a maintainer deciding whether to add direct coverage now.
- The VLM deploy smoke test (multimodal quickstart) is only reached for the fp8 path; int8_sq/non-Blackwell nvfp4 exit early before the deploy block. This matches old behavior (not a regression) but is worth confirming is intended.

meenchen

Is the refactor functionally equivalent?

meenchen · 2026-06-12T23:45:46Z

+LLMs — the language model is quantized while the vision encoder is kept in high precision. Pass
+`--vlm` to the shell script (see [VLM quantization](#vlm-quantization)).
+
+| Model | fp8 | int8_sq<sup>1</sup> | int4_awq | w4a8_awq<sup>2</sup> | nvfp4<sup>3</sup> |


Can we merge this list to https://github.com/NVIDIA/Model-Optimizer/tree/main/examples/llm_ptq#hugging-face-supported-models?

shengliangxu

LGTM

kevalmorabia97 · 2026-06-13T05:52:03Z

Do we want to drop VILA model support? ModelOpt's minimum transformers version is 4.56, so we cannot continue guaranteeing that it works with 4.50.

Sounds good - dropped VILA support.
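
The conflict behind this decision is that no transformers release can satisfy both bounds at once. A minimal sketch of that check, using a simplified dotted-version parser (it ignores pre-release and local version tags, which a real resolver would handle):

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted release string like "4.50.0" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


vila_max = parse_version("4.50.0")    # upper bound from the VILA pin (transformers<=4.50.0)
modelopt_min = parse_version("4.56")  # ModelOpt minimum quoted in this thread

# A version v would need modelopt_min <= v <= vila_max; that range is empty when min > max.
no_compatible_version = modelopt_min > vila_max
print(no_compatible_version)  # True
```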

kevalmorabia97 · 2026-06-13T05:53:54Z

Do we also want to rename examples/llm_ptq to examples/hf_ptq?

This is a good idea, though I am not sure whether it would be a breaking change.

Can we leave a symlink from examples/llm_ptq/ to the new examples/hf_ptq/ directory so the previous path remains valid, and then remove the symlink after a few releases?

I think that probably works, but I would defer it to a follow-up PR, as we focus on soft deprecation of vlm_ptq here and the renaming would require changes in many other places.
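
The symlink idea from this thread, which was deferred and is not part of this PR, can be tried in a throwaway directory. The sketch below assumes a POSIX-like filesystem where creating symlinks is permitted; the file created inside `hf_ptq` is only a placeholder.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    examples = Path(root) / "examples"
    new_dir = examples / "hf_ptq"  # proposed new directory name
    new_dir.mkdir(parents=True)
    (new_dir / "hf_ptq.py").write_text("# placeholder\n")

    # Relative link, so the pair can be moved together: examples/llm_ptq -> hf_ptq
    os.symlink("hf_ptq", examples / "llm_ptq", target_is_directory=True)

    # The old path still resolves to the file under the new directory.
    old_path = examples / "llm_ptq" / "hf_ptq.py"
    resolved_ok = old_path.resolve() == (new_dir / "hf_ptq.py").resolve()
    print(resolved_ok)  # True
```

Whether a symlink checked into the repository behaves the same on every platform (for example, Windows checkouts) would need to be verified separately.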

kevalmorabia97 · 2026-06-13T05:56:40Z

We also need to merge tests and CI jobs:

tests/examples/vlm_ptq merged into tests/examples/llm_ptq

Remove vlm_ptq test job from https://github.com/NVIDIA/Model-Optimizer/blob/main/.github/workflows/example_tests.yml

… feedback)

Address PR review feedback on the vlm_ptq -> llm_ptq consolidation:
- Drop VILA/NVILA support: its modeling code requires transformers<=4.50.0, which conflicts with ModelOpt's minimum transformers version. Remove the VILA bootstrap (repo clone, requirements-vila.txt) and the VILA loading paths in example_utils.py.
- Merge the VLM support matrix into the main "Hugging Face Supported Models" table (rows tagged (VLM)); replace the separate VLM subsection with a note.
- Move the VLM example test into tests/examples/llm_ptq via run_llm_ptq_command(..., vlm=True) for direct --vlm coverage; drop the vlm_ptq CI matrix entries and remove run_vlm_ptq_command.
- Quote smoke-test paths in huggingface_example.sh (CodeRabbit nit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>

Edwardf0t1 · 2026-06-16T06:34:55Z

Is the refactor functionally equivalent?

I think so, but we dropped VILA support. See discussions in #1705 (comment)

…nto-llm-ptq Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com> # Conflicts: # README.md

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/example_tests.yml:
- Line 58: The workflow matrix in the example_tests.yml file at lines 58 and 72
currently only includes llm_ptq, which removes CI coverage for the deprecated
vlm_ptq forwarding shim. Since vlm_ptq is still marked for backward
compatibility during a deprecation window and has no other CI coverage, you must
either add vlm_ptq back to the example matrix at both line 58 and line 72 (the
sibling location), or create a separate standalone smoke test for the deprecated
shim, or remove the shim entirely if the deprecation period has ended. Choose
the appropriate approach based on the deprecation timeline for this component.

In `@examples/llm_ptq/example_utils.py`:
- Around line 656-665: The `has_pack_quantized_config` function assumes
`quantization_config` is always a dict by calling `.get()` on it, but in Hugging
Face configs it can be either a dict or a config object. Follow the pattern used
in `get_original_hf_quant_method()` which handles both cases, applying the same
defensive approach to both the top-level `quantization_config` check and the
nested `text_config.quantization_config` check. Use appropriate type checking
and attribute access methods to safely retrieve the "format" value regardless of
whether it is stored as a dict key or an object attribute.
- Around line 706-708: The code accesses hf_config.architectures[0] without
verifying that the architectures list is not empty. Even though there may be a
check for None earlier in the code, an empty list would still cause an
IndexError at this index access. Add a length check to ensure
hf_config.architectures has at least one element before accessing the first
element at index 0, and handle the case where the list is empty appropriately
(either by raising a more informative error or providing a fallback value).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 05235eca-5c7f-432d-a1f9-bc1ce79cc3fd

📥 Commits

Reviewing files that changed from the base of the PR and between 3d42843 and 1b66de7.

📒 Files selected for processing (10)

.github/workflows/example_tests.yml
CHANGELOG.rst
README.md
examples/llm_ptq/README.md
examples/llm_ptq/example_utils.py
examples/llm_ptq/scripts/huggingface_example.sh
examples/vlm_ptq/README.md
tests/_test_utils/examples/run_command.py
tests/examples/llm_ptq/test_vlm_ptq.py
tests/examples/vlm_ptq/_extensions/test_torch_extensions.py

💤 Files with no reviewable changes (1)

tests/examples/vlm_ptq/_extensions/test_torch_extensions.py

✅ Files skipped from review due to trivial changes (2)

CHANGELOG.rst
examples/vlm_ptq/README.md

🚧 Files skipped from review as they are similar to previous changes (1)

README.md

coderabbitai · 2026-06-16T06:37:33Z

      fail-fast: false
      matrix:
-        example: [llm_ptq, vlm_ptq]
+        example: [llm_ptq]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n --hidden -C2 'vlm_ptq|run_vlm_ptq_command' .github/workflows tests examples

Repository: NVIDIA/Model-Optimizer

Length of output: 1782

Add CI coverage for the deprecated vlm_ptq forwarding shim or defer its removal.

The workflow matrices drop vlm_ptq from both TensorRT-LLM jobs. Since no other workflows or tests reference the deprecated shim at examples/vlm_ptq/scripts/huggingface_example.sh, this change removes its only CI coverage. The shim is still marked for backward-compatibility during a deprecation window—it should either remain in the test matrix, be covered by a standalone smoke test, or be removed entirely if the deprecation period has ended.

Also applies to: 72-72

coderabbitai · 2026-06-16T06:37:33Z

+    def has_pack_quantized_config(config):
+        # Check top-level quantization_config
+        if hasattr(config, "quantization_config"):
+            if config.quantization_config.get("format", None) == "pack-quantized":
+                return True
+        # Check nested text_config.quantization_config (for multi-modal models like kimi k2.5)
+        if hasattr(config, "text_config") and hasattr(config.text_config, "quantization_config"):
+            if config.text_config.quantization_config.get("format", None) == "pack-quantized":
+                return True
+        return False


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

quantization_config may be an object, not a dict, causing AttributeError.

The helper assumes quantization_config is always a dict (uses .get()), but HF configs can store it as a config object. Compare with get_original_hf_quant_method() (lines 589-595) which handles both cases.

Proposed fix

     def has_pack_quantized_config(config):
         # Check top-level quantization_config
         if hasattr(config, "quantization_config"):
-            if config.quantization_config.get("format", None) == "pack-quantized":
+            qcfg = config.quantization_config
+            fmt = qcfg.get("format") if isinstance(qcfg, dict) else getattr(qcfg, "format", None)
+            if fmt == "pack-quantized":
                 return True
         # Check nested text_config.quantization_config (for multi-modal models like kimi k2.5)
         if hasattr(config, "text_config") and hasattr(config.text_config, "quantization_config"):
-            if config.text_config.quantization_config.get("format", None) == "pack-quantized":
+            qcfg = config.text_config.quantization_config
+            fmt = qcfg.get("format") if isinstance(qcfg, dict) else getattr(qcfg, "format", None)
+            if fmt == "pack-quantized":
                 return True
         return False
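
A standalone variant of this fix can be exercised without loading a real model. The sketch below factors the dict-or-object lookup into a small helper (`get_quant_format` is a name introduced here, not one from the repository) and uses `SimpleNamespace` objects as stand-ins for Hugging Face configs.

```python
from types import SimpleNamespace


def get_quant_format(qcfg):
    """Read "format" from a quantization config stored as a dict or as an object."""
    if isinstance(qcfg, dict):
        return qcfg.get("format")
    # Works for config objects and also for None (missing config).
    return getattr(qcfg, "format", None)


def has_pack_quantized_config(config) -> bool:
    # Top-level quantization_config
    if get_quant_format(getattr(config, "quantization_config", None)) == "pack-quantized":
        return True
    # Nested text_config.quantization_config (multi-modal models)
    text_config = getattr(config, "text_config", None)
    return get_quant_format(getattr(text_config, "quantization_config", None)) == "pack-quantized"


# Stand-in configs; real Hugging Face config classes are not used here.
dict_style = SimpleNamespace(quantization_config={"format": "pack-quantized"})
object_style = SimpleNamespace(
    text_config=SimpleNamespace(quantization_config=SimpleNamespace(format="pack-quantized"))
)
plain = SimpleNamespace()

print(has_pack_quantized_config(dict_style), has_pack_quantized_config(object_style), has_pack_quantized_config(plain))
```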

coderabbitai · 2026-06-16T06:37:33Z

+    else:
+        architecture = hf_config.architectures[0]



⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Potential IndexError if architectures is an empty list.

Line 707 accesses hf_config.architectures[0] without checking for an empty list. While is_speculative() guards against None, an empty architectures list would still reach this branch and raise IndexError.

Proposed fix

     else:
+        if not hf_config.architectures:
+            raise ValueError(f"Model config at {ckpt_path} has no architectures defined")
         architecture = hf_config.architectures[0]
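
The same guard, pulled into a small function so it can be run in isolation. `first_architecture` is a hypothetical name, and the config objects are `SimpleNamespace` stand-ins, not real Hugging Face configs.

```python
from types import SimpleNamespace


def first_architecture(hf_config, ckpt_path: str) -> str:
    """Return the first architecture name, failing clearly when none is listed."""
    architectures = getattr(hf_config, "architectures", None)
    if not architectures:  # covers both None and an empty list
        raise ValueError(f"Model config at {ckpt_path} has no architectures defined")
    return architectures[0]


print(first_architecture(SimpleNamespace(architectures=["ExampleForCausalLM"]), "/path/to/ckpt"))

try:
    first_architecture(SimpleNamespace(architectures=[]), "/path/to/ckpt")
except ValueError as err:
    print(err)
```

Using `if not architectures` handles `None` and `[]` with one check, which is why the proposed fix does not need a separate length test.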

Edwardf0t1 requested review from a team as code owners June 12, 2026 22:47

Edwardf0t1 requested review from ajrasane, kevalmorabia97 and meenchen June 12, 2026 22:47

Edwardf0t1 requested review from chadvoegele, cjluo-nv, shengliangxu and yueshen2016 June 12, 2026 22:52

coderabbitai Bot reviewed Jun 12, 2026

Comment thread examples/llm_ptq/scripts/huggingface_example.sh Outdated

cjluo-nv reviewed Jun 12, 2026

meenchen reviewed Jun 12, 2026

shengliangxu approved these changes Jun 13, 2026

kevalmorabia97 reviewed Jun 13, 2026

Edwardf0t1 requested a review from a team as a code owner June 16, 2026 06:31

Merge remote-tracking branch 'origin/main' into consolidate-vlm-ptq-i…

7397fe1

…nto-llm-ptq Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com> # Conflicts: # README.md

coderabbitai Bot reviewed Jun 16, 2026
